
determining a maximum value and a minimum value of a function obtained in 
this way; and 

calculating thresholds that serve as a basis for distinguishing between text 
line and text interspace calculated based on these extremes; 

ascertaining a line interspace when said function has a combination of a
maximum with a minimum in which said minimum has a value of less than

function minimum + number of pixels over the width of the image excerpt/15 +
2*number of pixels over the width of the image excerpt/15 * function
maximum/number of pixels over the width of the image excerpt, and

a decrease in the function values after the maximum has a value of greater
than (function maximum - function minimum)/2.

2) (Amended) The method according to claim 1, further comprising the step
of:

determining, in order to ascertain a left-hand edge of a line, said brightness
distribution of a captured image excerpt along a horizontal and a function obtained in
this way represents a beginning of a line by an abrupt rise in said function value.



3) (Amended) A method according to claim 1, further comprising the step of
determining, after a position of a line has initially been ascertained, a further profile
of said line by evaluating information concerning text characters recognized.



REMARKS 

The present Amendment revises the specification and claims to conform to
United States patent practice, before examination of the present PCT application in
the United States National Examination Phase. Pursuant to 37 CFR 1.125 (b),
applicants have concurrently submitted a substitute specification, excluding the
claims, and provided a marked-up copy. All of the changes are editorial and
applicants believe no new matter is added thereby. The amendment, addition,
and/or cancellation of claims is not intended to be a surrender of any of the subject
matter of those claims.

Early examination on the merits is respectfully requested.

This redlined draft, generated by CompareRite (TM) - The Instant Redliner, shows
OF TEXT LINES\AMENDED CLAIMS.DOC

CompareRite found 36 change(s) in the text 

Deletions appear as Overstrike text surrounded by []
Additions appear as Bold-Underline text 



1) [Method] (Amended) A method for determining the position of text lines in
text recognition tasks, [whereby the] comprising the steps of:

determining a brightness distribution of an acquired image excerpt along
[the] a vertical [is determined] by histogram formation along [the] said lines[, and];



smoothing said brightness distribution is smoothed [, whereby];

determining a maximum value and a minimum value of [the] a function
obtained in this way [are determined,]; and

calculating thresholds that serve as [the] a basis for distinguishing between
text line and text interspace [are] calculated based on [the basis of] these extremes[,
characterized in that];

ascertaining a line interspace [is ascertained] when [the] said function has a
combination of a maximum with a minimum in which [the] said minimum has a value
of less than

function minimum + number of pixels over the width of the image excerpt/15 +
2*number of pixels over the width of the image excerpt/15 * function
maximum/number of pixels over the width of the image excerpt, and

[the] a decrease in the function values after the maximum has a value of
greater than (function maximum - function minimum)/2.

2) [Method] (Amended) The method according to [one of the Claims 1,
characterized in that] claim 1, further comprising the step of:

determining, in order to ascertain [the] a left-hand edge of a line, [the] said
brightness distribution of a captured image excerpt along [the] a horizontal [is
determined] and [the] a function obtained in this way represents [the] a beginning of
a line by an abrupt rise in [the] said function value.

3) [Method] (Amended) A method according to [one of Claims 1 or 2,
characterized in that after the] claim 1, further comprising the step of
determining, after a position of a line has initially been ascertained, [the] a further
profile of [the] said line [is determined] by evaluating [the] information concerning
[the] text characters recognized.

Preliminary Amendment A 



This redlined draft, generated by CompareRite (TM) - The Instant Redliner, shows
the differences between -

original document : Q:\DOCUMENTS\YEAR 2001\P010104-AIGNER-POSITION
OF TEXT LINES\ORIGINAL SPECIFICATION.DOC

and revised document: Q:\DOCUMENTS\YEAR 2001\P010104-AIGNER-POSITION
OF TEXT LINES\SUBSTITUTE SPECIFICATION.DOC

CompareRite found 91 change(s) in the text

Deletions appear as Overstrike text surrounded by []
Additions appear as Bold-Underline text



[Method for determining the position of text lines in text recognition tasks]

SPECIFICATION
TITLE

METHOD FOR DETERMINING THE POSITION OF TEXT LINES IN TEXT RECOGNITION TASKS 

BACKGROUND OF THE INVENTION 

Background of the Invention 

[0001] The invention relates to a method for determining the position of text lines in text
recognition tasks [, whereby] in which the brightness distribution of an acquired image excerpt along
the vertical is determined by histogram formation along the lines, and this brightness distribution is
smoothed [, whereby]. A maximum value and minimum value of the function obtained in this way are
determined, and thresholds that serve as the basis for distinguishing between text line and text
interspace are calculated on the basis of these extremes.

Field of the invention

[0002] In the case of the automatic recognition of texts, that is to say in the case of the
conversion of the graphical information of a document into text characters which can be further
processed by means of electronic text processing programs, an essential prerequisite for a successful
recognition operation is that the position and the size of the individual characters be determined
accurately. This presupposes in turn that the position and the dimensions of the text lines be known.

[0003] In the case of manually guided readers, moreover, the profile of the text lines in the
captured image excerpt turns out to be non-linear. In this context, there is a need to determine the
profile of a text line.

[0004] A method of the species initially cited is disclosed in EP 0702 329 A2. This
publication discloses a method and an apparatus for determining the line course given handwritten
documents. According to this publication, the picture elements are summed up line-by-line, smoothed
and analysed for the determination of the position of the lines.

SUMMARY OF THE INVENTION

[0005] The invention is based on the object of improving this method.

[0006] This is done according to the invention by [means of] a method of the type mentioned
in the introduction [wherein] in which a line interspace is identified when the function comprises a
combination of a maximum with a minimum [, whereby the]. The minimum comprises a value less
than function minimum + plurality of picture elements over the width of the image excerpt/15 +
2*plurality of the picture elements over the width of the image excerpt/15 * function maximum/
plurality of picture elements over the width of the image excerpt and the drop off of the function values
after the maximum comprises a value greater than (function maximum - function minimum)/2. This
embodiment has proven itself in practice on the basis of very good results.

[Advantage is afforded by a] [0007] An advantageous refinement of the method [wherein] is
provided in which, in order to ascertain the left-hand edge of a line, the brightness distribution of a
captured image excerpt along the horizontal is determined and the function obtained in this way
represents the beginning of a line by an abrupt rise in the function value. The beginning of a line can
thus be determined in a simple manner with little complexity. Furthermore, for the determination of the
position of the text lines, it can be ensured that in this case only images which actually contain text
lines are taken into consideration and a user error, such as, e.g., positioning the reading pen too far to
the left of the beginning of a line, does not influence the determination of the line.

[0008] It is expedient if, after the position of a line has initially been ascertained, the further
course of the [said] line is determined by evaluating the information concerning the text characters
recognized. Evaluating the results of the character classification enables the line profile to be
determined particularly accurately.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0009] The invention is explained in more detail with reference to [figures, in which, by way
of example:] the following exemplary figures.

[0010] Figure 1 [shows] is a diagram of a screen shot showing a text excerpt of the kind
that is typically captured by a manually guided reader, and also the histogram determined [therefrom]
from it; and

[0011] Figure 2 shows the filtered histogram with the parameters entered for the assessment
of the image.

DETAILED DESCRIPTION OF THE INVENTION

[0012] The sequence of the method according to the invention is as follows:



[0013] A line histogram is determined for the captured image excerpt. In this case, for each
line, the values of all the pixels of this line (0 for white and 1 for black) are summed. The result is a
function f(y) with

[0014]
\[ f(y) = \sum_{i=0}^{\mathrm{Width}} (\mathrm{BlackPixel})_{i} \]
where:

[0015] y denotes the line index of the image; and

[0016] Width indicates the width (number of columns) of the image excerpt.
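
The line histogram described above can be illustrated with a minimal Python sketch. It assumes the captured excerpt is already binarized into a list of rows holding 0 (white) or 1 (black); the function name `line_histogram` and the sample data are hypothetical and not taken from the specification.

```python
from typing import List


def line_histogram(image: List[List[int]]) -> List[int]:
    """Return f(y): the number of black pixels (value 1) in each image row."""
    return [sum(row) for row in image]


# Tiny binary excerpt: rows 1 and 2 carry "text", rows 0 and 3 are blank.
excerpt = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
f = line_histogram(excerpt)
```

Rows that cross a text line yield large counts and rows in an interspace yield counts near zero, which is the profile the later steps evaluate.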

[0017] When a text is present, this function has a typical profile as illustrated by way of
example in Figure 1. In a further step, filtering is carried out in accordance with

[0018]
\[ f'(y) = \frac{\sum_{i=-5}^{+5} f(y+i) \cdot G(i)}{\sum_{i=-5}^{+5} G(i)} \]

[0019] where:

[0020] y index in the line histogram;

[0021] G weighting corresponding to an exponential smoothing curve; and

[0022] i index of the smoothing curve.

[0023] During the filtering operation, values are also determined for the absolute maximum
Valuemax, i.e., the number of black pixels of the darkest line, and the absolute minimum Valuemin, i.e.,
the number of black pixels in the brightest line [are also determined].

[0024] Parameters for the assessment of the image are derived from these two values. [The
said] These parameters are:



[0025] Trough limit = (Valuemax - Valuemin)/2

[0026] but at least number of pixels over the width of the image excerpt/30

[0027] Minima edge = Valuemin + number of pixels over the width of the image excerpt/15

[0028] but at most 2*number of pixels over the width of the image excerpt/15

[0029] Minima threshold = Minima edge + (2*number of pixels over the width of the image
excerpt/15 * (Valuemax/number of pixels over the width of the image excerpt))

[0030] but at most 2*number of pixels over the width of the image excerpt/15.

[0031] Using the function f(y) and the threshold values determined, as are illustrated by way
of example in Figure 2, the captured image is then assessed with regard to the presence of text lines
and line interspaces.
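
The filtering step and the derived parameters can be sketched as follows. This is an illustration under stated assumptions, not the applicants' implementation: the specification does not give the coefficients of the exponential smoothing curve, so G(i) = exp(-decay * |i|) with a hypothetical `decay` is used; samples outside the histogram are treated as 0; and the final "at most" limit is read as applying to the Minima threshold. The function names are hypothetical.

```python
import math
from typing import List, Tuple


def smooth(f: List[float], half_width: int = 5, decay: float = 0.5) -> List[float]:
    """Weighted average over i = -5..+5, normalised by the sum of the weights.

    G(i) = exp(-decay * |i|) is an assumed stand-in for the exponential
    smoothing curve. Samples outside the histogram count as 0 (assumption).
    """
    offsets = range(-half_width, half_width + 1)
    weights = {i: math.exp(-decay * abs(i)) for i in offsets}
    total = sum(weights.values())
    smoothed = []
    for y in range(len(f)):
        acc = 0.0
        for i in offsets:
            if 0 <= y + i < len(f):
                acc += f[y + i] * weights[i]
        smoothed.append(acc / total)
    return smoothed


def thresholds(value_max: float, value_min: float, width: int) -> Tuple[float, float, float]:
    """Return (trough_limit, minima_edge, minima_threshold)."""
    cap = 2 * width / 15
    trough_limit = max((value_max - value_min) / 2, width / 30)
    minima_edge = min(value_min + width / 15, cap)
    # Assumption: the last "at most" limit applies to the Minima threshold.
    minima_threshold = min(minima_edge + cap * (value_max / width), cap)
    return trough_limit, minima_edge, minima_threshold


flat = smooth([3.0] * 21)
trough_limit, minima_edge, minima_threshold = thresholds(60.0, 2.0, 150)
```

For a 150-pixel-wide excerpt with Valuemax = 60 and Valuemin = 2, this reading gives a Trough limit of 29, a Minima edge of 12 and a Minima threshold of 20.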

[0032] For this purpose, the curve profile is examined to see whether values which are
smaller than the parameter Minima threshold are present. If this is the case, then the relevant area is
qualified as a valid minimum and thus as a possible line interspace.

[0033] An actual line interspace is present, however, only when the presence of a text line is
indicated by an adjoining maximum with a certain characteristic value. These valid maxima are
defined by a subsequent decreasing of the curve value by a magnitude > Trough limit.

[0034] The coincidence of a valid maximum with a valid minimum characterizes the
transition from a text line to a line interspace. The parameter Minima edge serves for accurately
determining this transition.

[0035] The point at which the curve intersects this threshold between a valid maximum and a
valid minimum is defined as a line edge.
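
One possible reading of this assessment is sketched below in Python. It scans the filtered histogram once, tracks the highest value since the last interspace, and reports the index at which the curve first falls below Minima edge once both the valid-maximum and valid-minimum conditions hold. The scanning strategy, the function name `find_line_edges` and the sample values are assumptions made for illustration; the specification states the conditions but not a traversal order.

```python
from typing import List, Optional


def find_line_edges(f: List[float], trough_limit: float,
                    minima_edge: float, minima_threshold: float) -> List[int]:
    """Return the indices y at which a text line passes into a line interspace."""
    edges: List[int] = []
    peak: Optional[float] = None       # highest value since the last interspace
    crossing: Optional[int] = None     # first index below minima_edge after the peak
    for y, value in enumerate(f):
        if peak is None or value > peak:
            peak, crossing = value, None          # new candidate maximum
            continue
        if crossing is None and value < minima_edge:
            crossing = y                          # curve fell through Minima edge
        valid_minimum = value < minima_threshold
        valid_maximum = peak - value > trough_limit
        if crossing is not None and valid_minimum and valid_maximum:
            edges.append(crossing)
            peak, crossing = None, None           # look for the next text line
    return edges


profile = [0, 0, 30, 60, 58, 30, 5, 1, 0, 0, 40, 62, 35, 3, 0]
edges = find_line_edges(profile, trough_limit=29, minima_edge=12, minima_threshold=20)
```

With the sample profile, the two text lines end at indices 6 and 13. A curve that drops below Minima threshold but never below Minima edge yields no edge in this sketch.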

[0036] In order to determine the left-hand edge of a line, a column histogram is created in
accordance with

[0037]
\[ f(x) = \sum_{i=0}^{\mathrm{Height}} (\mathrm{BlackPixel})_{i} \]

[0038] where

[0039] x column [... column] index of the image excerpt; and

[0040] Height [Image] image height

[0041] In words, the colour information of the pixels of each column of the captured image
excerpt is summed. The left-hand text edge is defined (given the presence of at least one line) by an
abrupt rise in the function value f(x).
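
A short Python sketch of the column histogram and the left-edge test follows. The specification does not quantify "abrupt rise", so the parameter `min_rise` is a hypothetical tuning value, and treating the area to the left of the excerpt as blank is likewise an assumption.

```python
from typing import List, Optional


def column_histogram(image: List[List[int]]) -> List[int]:
    """Return f(x): the number of black pixels (value 1) in each image column."""
    return [sum(column) for column in zip(*image)]


def left_text_edge(f: List[int], min_rise: int = 2) -> Optional[int]:
    """Return the first column whose count exceeds the previous column's count
    by at least min_rise, or None if no such rise occurs.

    min_rise is a hypothetical threshold; the text only says "abrupt rise".
    """
    previous = 0  # assumption: the area left of the excerpt is blank
    for x, value in enumerate(f):
        if value - previous >= min_rise:
            return x
        previous = value
    return None


excerpt = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 1, 0],
]
columns = column_histogram(excerpt)
edge = left_text_edge(columns)
```

Here the column counts are 0, 0, 3, 2, 2, 2, so the left-hand text edge is found at column 2.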

[0042] The follow-up plotting of the lines [, that is to say], i.e., the information concerning the
further profile of the lines, which is important particularly in the case of manually guided readers on
account of the fluctuations that occur with the latter, is effected on the basis of the position of the
recognized characters.

[0043] For this purpose, the recognized characters are classified into the following size
groups:

[0044] Small characters (for example "a") 0.7*line height;

[0045] Large characters (for example "AVg") line height;

[0046] Oversize characters (for example T.T) line height + 0.3*line height
(descenders);

[0047] Special characters: the characters cannot be unambiguously assigned by size.

[0048] The following character groups are differentiated for the determination of the new
lower edge of the text line:

[0049] Baseline characters (for example "A", "."): the lower edge of the character
corresponds to the lower edge of the text line, irrespective of the size of the character;

[0050] Descender characters (for example "g", T): the lower edge of the character
corresponds to the descender boundary, irrespective of the size of the character;

[0051] Special characters: these characters cannot be unambiguously assigned with regard
to their lower edge.

[0052] On the basis of these assignments and a probability value G relating to the correct
classification of the character, [the said] this probability value being obtained in the course of the
classification method, the new line height Height is then determined as follows:

[0053]
\[ G = \mathrm{Probability} \cdot \mathrm{CYC\_MAX\_WEIGHT} \]

[0054]
\[ \mathrm{Height} = \frac{\sum_{i}^{\mathrm{CYC\_MAX\_EXTRPAR}} \mathrm{OldHeight}[i] + \mathrm{NewHeight} \cdot G}{\mathrm{CYC\_MAX\_EXTRPAR} + G} \]

[0055] G weighting of the line height derived from the current character;

[0056] Probability probability of correct character classification (range of values between
0 and 1);

[0057] CYC_MAX_WEIGHT maximum weighting of the new character position (for
example: 5);

[0058] Height subsequently plotted line height (upper case letter height);

[0059] CYC_MAX_EXTRPAR size of the ring buffer for the averaging (for example: 3);

[0060] OldHeight[] ring buffer;

[0061] NewHeight line height derived from the current character (upper case letter
height); and

[0062] i index in the ring buffer.
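
The line-height update can be illustrated with a small Python sketch that uses the example constants given above (CYC_MAX_WEIGHT = 5, CYC_MAX_EXTRPAR = 3). Two points are assumptions not stated in the specification: the ring buffer is taken to be full, and the blended height is what gets written back into the buffer. The function name `update_height` is hypothetical.

```python
from collections import deque

CYC_MAX_WEIGHT = 5     # example value from the text
CYC_MAX_EXTRPAR = 3    # example ring-buffer size from the text


def update_height(old_heights: deque, new_height: float, probability: float) -> float:
    """Blend the buffered heights with the height derived from the current character.

    Assumes old_heights already holds CYC_MAX_EXTRPAR entries.
    """
    g = probability * CYC_MAX_WEIGHT
    height = (sum(old_heights) + new_height * g) / (CYC_MAX_EXTRPAR + g)
    old_heights.append(height)  # assumption: the blended value is buffered
    return height


confident = update_height(deque([20.0] * 3, maxlen=CYC_MAX_EXTRPAR), 26.0, 1.0)
doubtful = update_height(deque([20.0] * 3, maxlen=CYC_MAX_EXTRPAR), 26.0, 0.2)
```

A character classified with probability 1 pulls the height from 20 to 23.75, whereas a character with probability 0.2 moves it only to 21.5, which shows how G damps uncertain classifications.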

[0063] The profile of the lower edge of the text line is determined in accordance with:

[0064]
\[ G = \mathrm{Probability} \cdot \mathrm{CYC\_MAX\_WEIGHT} \]

[0065]
\[ \mathrm{Increase} = \frac{\mathrm{OldIncrease} + \mathrm{NewIncrease} \cdot G}{1 + G} \]

[0066]
\[ \mathrm{Base} = \mathrm{NewBase} + \frac{\mathrm{Increase} \cdot \mathrm{DeltaX} + 50}{100} \]

[0067] G weighting of the new character position;

[0068] Probability probability of correct character classification;

[0069] CYC_MAX_WEIGHT maximum weighting of the new character position (for
example: 5);

[0070] Increase subsequently plotted current gradient of the baseline in %;

[0071] OldIncrease previous gradient of the baseline in %;

[0072] NewIncrease gradient of the baseline in % calculated from the position of the
current character;

[0073] Base subsequently plotted baseline position (rounded to an integer value);

[0074] NewBase baseline position calculated from the position of the current character;
and

[0075] DeltaX X-separation in the image between the two centre points of the
characters extracted last.

[0076] The "Increase" is limited by the plausibility limit CYC_MAX_LINEOFFSET (in the
Pocket Reader: 15%).
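
The baseline follow-up can be sketched in Python as shown below. The printed formula for Increase is damaged in this copy, so the sketch assumes a weighted average of the form (OldIncrease + NewIncrease * G)/(1 + G), patterned on the line-height formula; this form, the symmetric clamping to plus or minus CYC_MAX_LINEOFFSET, and the function name `update_baseline` are assumptions. The Base computation, including the +50 before dividing by 100 to round a percentage to an integer, follows the text.

```python
import math
from typing import Tuple

CYC_MAX_WEIGHT = 5        # example value from the text
CYC_MAX_LINEOFFSET = 15   # plausibility limit in %, Pocket Reader value from the text


def update_baseline(old_increase: float, new_increase: float, new_base: int,
                    delta_x: float, probability: float) -> Tuple[float, int]:
    """Return (Increase, Base) for the follow-up plotting of the baseline."""
    g = probability * CYC_MAX_WEIGHT
    # Assumed form of the weighted average; the source formula is garbled.
    increase = (old_increase + new_increase * g) / (1 + g)
    # Assumed symmetric clamp to the plausibility limit.
    increase = max(-CYC_MAX_LINEOFFSET, min(CYC_MAX_LINEOFFSET, increase))
    # Increase is in percent: adding 50 before dividing by 100 rounds to nearest.
    base = new_base + math.floor((increase * delta_x + 50) / 100)
    return increase, base


gentle = update_baseline(0.0, 12.0, 100, 24.0, 1.0)
steep = update_baseline(10.0, 40.0, 50, 20.0, 1.0)
```

In the first call the gradient settles at 10% and the baseline moves from 100 to 102 over a 24-pixel step; in the second call the raw gradient of 35% is cut back to the 15% limit before the baseline is advanced.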
[Abstract]

[0077] The above-described method is illustrative of the principles of the present
invention. Numerous modifications and adaptations will be readily apparent to those skilled in
this art without departing from the spirit and scope of the present invention.

ABSTRACT

[0078] A method for determining the position of text lines in text recognition tasks is
specified in which the brightness distribution of a captured image excerpt along the vertical is
determined and this brightness distribution is filtered [, in which]. In the method, maxima and minima
of the function obtained in this way are determined and, on the basis of these extrema, threshold
values are calculated which serve as a basis for distinguishing between text line and line interspace.
The method can be used particularly advantageously in the case of manually guided electronic
readers.